Mechanistic Phenotypes: An Aggregative Phenotyping Strategy to Identify Disease Mechanisms Using GWAS Data

Authors

  • Jonathan D. Mosley
  • Sara L. Van Driest
  • Emma K. Larkin
  • Peter E. Weeke
  • John S. Witte
  • Quinn S. Wells
  • Jason H. Karnes
  • Yan Guo
  • Lisa Bastarache
  • Lana M. Olson
  • Catherine A. McCarty
  • Jennifer A. Pacheco
  • Gail P. Jarvik
  • David S. Carrell
  • Eric B. Larson
  • David R. Crosslin
  • Iftikhar J. Kullo
  • Gerard Tromp
  • Helena Kuivaniemi
  • David J. Carey
  • Marylyn D. Ritchie
  • Josh C. Denny
  • Dan M. Roden
Abstract

A single mutation can alter cellular and global homeostatic mechanisms and give rise to multiple clinical diseases. We hypothesized that these disease mechanisms could be identified using low minor allele frequency (MAF < 0.1) non-synonymous SNPs (nsSNPs) associated with "mechanistic phenotypes", which are collections of related diagnoses. We studied two mechanistic phenotypes: (1) thrombosis, evaluated in a population of 1,655 African Americans; and (2) four groupings of cancer diagnoses, evaluated in 3,009 white European Americans. We tested associations between nsSNPs represented on GWAS platforms and mechanistic phenotypes ascertained from electronic medical records (EMRs), and sought enrichment in functional ontologies across the top-ranked associations. We used a two-step analytic approach. In the first step, nsSNPs were sorted by the strength of their association with a phenotype; associations were tested using two reverse genetic models and standard additive and recessive models. In the second step, we applied a hypothesis-free ontological enrichment analysis to the sorted nsSNPs to identify functional mechanisms underlying the diagnoses that make up the mechanistic phenotypes. The thrombosis phenotype was associated solely with ontologies related to blood coagulation (Fisher's p = 0.0001, FDR p = 0.03), driven by the F5, P2RY12 and F2RL2 genes. For the cancer phenotypes, the reverse genetics models were enriched in DNA repair functions (p = 2×10⁻⁵, FDR p = 0.03) (POLG/FANCI, SLX4/FANCP, XRCC1, BRCA1, FANCA, CHD1L), while the additive model showed enrichment related to chromatid segregation (p = 4×10⁻⁶, FDR p = 0.005) (KIF25, PINX1). We were able to replicate nsSNP associations for POLG/FANCI, BRCA1, FANCA and CHD1L in independent data sets. Mechanism-oriented phenotyping using collections of EMR-derived diagnoses can elucidate fundamental disease mechanisms.
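The two-step approach described in the abstract (rank by association strength, then test the top-ranked set for ontology enrichment with Fisher's test and an FDR correction) can be illustrated with a minimal sketch. This is not the authors' pipeline. It assumes gene-level association p-values are already available, uses a fixed top-N cutoff (a simplification; the abstract does not state how the ranked list was cut), and applies a one-sided Fisher's exact test followed by a Benjamini-Hochberg adjustment. The function names, the gene labels `g0`..`g9`, the term names and all numbers are hypothetical toy values.

```python
from math import comb


def enrichment_p(top_genes, all_genes, term_genes):
    """One-sided Fisher's exact p-value (hypergeometric upper tail).

    Probability of seeing at least the observed overlap between the
    top-ranked genes and an ontology term, given the tested universe.
    """
    universe = set(all_genes)
    top = set(top_genes) & universe
    term = set(term_genes) & universe
    n_universe, n_top, n_term = len(universe), len(top), len(term)
    overlap = len(top & term)
    total = comb(n_universe, n_top)
    tail = sum(
        comb(n_term, i) * comb(n_universe - n_term, n_top - i)
        for i in range(overlap, min(n_term, n_top) + 1)
    )
    return tail / total


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def rank_and_enrich(assoc_pvals, ontology, top_n):
    """Step 1: sort genes by association p-value. Step 2: test each term."""
    ranked = sorted(assoc_pvals, key=assoc_pvals.get)
    top = ranked[:top_n]
    terms = sorted(ontology)
    raw = [enrichment_p(top, ranked, ontology[t]) for t in terms]
    adj = benjamini_hochberg(raw)
    return {t: (p, q) for t, p, q in zip(terms, raw, adj)}


if __name__ == "__main__":
    # Toy input only: hypothetical genes, p-values and ontology terms.
    toy_pvals = {f"g{i}": 0.001 * (i + 1) for i in range(10)}
    toy_ontology = {
        "termA": {"g0", "g1", "g2"},
        "termB": {"g7", "g8", "g9"},
    }
    for term, (p, q) in rank_and_enrich(toy_pvals, toy_ontology, 3).items():
        print(term, p, q)
```

A fixed cutoff keeps the sketch short. A rank-based enrichment method that scans all cutoffs would need an additional correction for the multiple thresholds tested.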

Similar resources

Multidimensional Clinical Phenotyping of an Adult Cystic Fibrosis Patient Population

BACKGROUND Cystic Fibrosis (CF) is a multi-systemic disease resulting from mutations in the Cystic Fibrosis Transmembrane Regulator (CFTR) gene and has major manifestations in the sino-pulmonary, and gastro-intestinal tracts. Clinical phenotypes were generated using 26 common clinical variables to generate classes that overlapped quantiles of lung function and were based on multiple aspects of ...


A Probabilistic Model for COPD Diagnosis and Phenotyping Using Bayesian Networks

Introduction: This research was meant to provide a model for COPD diagnosis and to classify the cases into phenotypes (General COPD, Chronic bronchitis, Emphysema, and the Asthmatic COPD) using a Bayesian Network (BN). Methods: The model was constructed through developing the Bayesian Network structure and instantiating the parameters for each of the variables. In order to validate the achiev...


Genome-wide association and high-resolution phenotyping link Oryza sativa panicle traits to numerous trait-specific QTL clusters

Rice panicle architecture is a key target of selection when breeding for yield and grain quality. However, panicle phenotypes are difficult to measure and susceptible to confounding during genetic mapping due to correlation with flowering and subpopulation structure. Here we quantify 49 panicle phenotypes in 242 tropical rice accessions with the imaging platform PANorama. Using flowering as a c...


On the Analysis of a Repeated Measure Design in Genome-Wide Association Analysis

Longitudinal data enables detecting the effect of aging/time, and as a repeated measures design is statistically more efficient compared to cross-sectional data if the correlations between repeated measurements are not large. In particular, when genotyping cost is more expensive than phenotyping cost, the collection of longitudinal data can be an efficient strategy for genetic association analy...


Applying semantic web technologies for phenome-wide scan using an electronic health record linked Biobank

BACKGROUND The ability to conduct genome-wide association studies (GWAS) has enabled new exploration of how genetic variations contribute to health and disease etiology. However, historically GWAS have been limited by inadequate sample size due to associated costs for genotyping and phenotyping of study subjects. This has prompted several academic medical centers to form "biobanks...


Journal title:

Volume 8, Issue

Pages -

Publication date: 2013